DOCKET NO.: MSFT-0580/167506.02

Application No.: 09/900,059 

Office Action Dated: September 17, 2004 



PATENT 

REPLY FILED UNDER EXPEDITED
PROCEDURE PURSUANT TO
37 CFR § 1.116

This listing of claims will replace all prior versions, and listings, of claims in the application. 
Listing of Claims: 

1. (original) A method for automatically classifying consonance of audio data, comprising:
applying audio data to a peak detection process; 

detecting the location of at least one prominent peak represented by the audio data in the 
frequency spectrum and determining the energy of the at least one prominent peak; 

storing the location of the at least one prominent peak and the energy of the at least one 
prominent peak into at least one output matrix; 

applying the data stored in said at least one output matrix to critical band masking 
filtering; 

applying the data stored in said at least one output matrix to a peak continuation process; 

and 

applying the data stored in said at least one output matrix to an intervals calculation 
process where the frequency of ratios between peaks are stored into an output vector for the 
audio data being classified. 

2. (original) A method according to claim 1, wherein the audio data is divided into frames, 
and the method is performed frame by frame. 

3. (original) A method according to claim 2, wherein the frame by frame approach includes 
bin differencing to calculate frame derivatives to facilitate the detection of peaks. 

4. (original) A method according to claim 2, wherein the number of peaks detected in said 
application of the peak detection process is limited by a pre-defined parameter. 

5. (original) A method according to claim 1, further comprising performing Nth order 
interpolation on the location of the at least one prominent peak and the energy of the at least one 
prominent peak to increase precision of the location and energy values for the peak.

6. (original) A method according to claim 1, further comprising applying the output vector
to a classification stage which determines at least one of (1) at least one consonance value and 
(2) at least one consonance class that describes the audio data. 

7. (original) A method according to claim 1, where the frequency of ratios between peaks 
are stored into an output vector that is 1 x 24. 

8. (original) A method according to claim 2, wherein the peak continuation process keeps 
track of peaks that last more than a predetermined number of frames.

9. (original) A method according to claim 8, wherein the peak continuation process fills in 
a peak when the peak is missed in a previous frame. 

10. (original) A method according to claim 1, wherein said critical band masking filtering 
removes a peak that is masked by surrounding peaks with more energy. 

11. (original) A method according to claim 10, wherein said critical band masking filtering
removes a peak when at least one of a lower frequency peak and a higher frequency peak have 
greater energy. 

12. (original) A method according to claim 10, wherein said critical band masking filters are 
scalable so that the amount of masking is scalable. 

13. (original) A method according to claim 1, wherein said storing includes providing an 
output of the peak detection and interpolation stage in two matrices, one holding the location of 
the at least one prominent peak, and the second holding the respective energy of the at least one 
prominent peak. 

14. (original) A method according to claim 1, wherein the audio data is formatted according 
to pulse code modulated format.

15. (original) A method according to claim 14, wherein the audio data is previously in a
format other than pulse code modulated format, and the method further comprises converting the 
audio data to pulse code modulated format from the other format. 

16. (original) The method of claim 1, further comprising converting the input audio data 
from the time domain to the frequency domain. 

17. (original) A method according to claim 16, wherein said converting of the input audio 
data signal from the time domain to the frequency domain includes performing a fast Fourier
transform on the audio data. 

18. (previously presented) A computer readable medium bearing computer executable 
instructions for: 

applying audio data to a peak detection process; 

detecting the location of at least one prominent peak represented by the audio data in the 
frequency spectrum and determining the energy of the at least one prominent peak; 

storing the location of the at least one prominent peak and the energy of the at least one 
prominent peak into at least one output matrix; 

applying the data stored in said at least one output matrix to critical band masking 
filtering; 

applying the data stored in said at least one output matrix to a peak continuation process; 

and 

applying the data stored in said at least one output matrix to an intervals calculation 
process where the frequency of ratios between peaks are stored into an output vector for the 
audio data being classified. 

19. (previously presented) A modulated data signal carrying computer executable 
instructions for: 

applying audio data to a peak detection process; 

detecting the location of at least one prominent peak represented by the audio data in the
frequency spectrum and determining the energy of the at least one prominent peak;

storing the location of the at least one prominent peak and the energy of the at least one 
prominent peak into at least one output matrix; 

applying the data stored in said at least one output matrix to critical band masking 
filtering; 

applying the data stored in said at least one output matrix to a peak continuation process; 

and 

applying the data stored in said at least one output matrix to an intervals calculation 
process where the frequency of ratios between peaks are stored into an output vector for the 
audio data being classified. 

20. (previously presented) At least one computing device comprising one or more 
subsystems for: 

applying audio data to a peak detection process; 

detecting the location of at least one prominent peak represented by the audio data in the 
frequency spectrum and determining the energy of the at least one prominent peak; 

storing the location of the at least one prominent peak and the energy of the at least one 
prominent peak into at least one output matrix; 

applying the data stored in said at least one output matrix to critical band masking 
filtering; 

applying the data stored in said at least one output matrix to a peak continuation process; 

and 

applying the data stored in said at least one output matrix to an intervals calculation 
process where the frequency of ratios between peaks are stored into an output vector for the 
audio data being classified. 

21-33. (canceled)